-
Somatic copy number alterations (sCNAs) are valuable phylogenetic markers for inferring evolutionary relationships among tumor cell subpopulations. Advances in single-cell DNA sequencing technologies are making it possible to obtain such sCNAs datasets at ever-larger scales. However, existing methods for reconstructing phylogenies from sCNAs are often too slow for large datasets. We propose two new distance-based methods, DICE-bar and DICE-star, for reconstructing single-cell tumor phylogenies from sCNA data. Using carefully simulated datasets, we find that DICE-bar matches or exceeds the accuracies of all other methods on noise-free datasets and that DICE-star shows exceptional robustness to noise and outperforms all other methods on noisy datasets. Both methods are also orders of magnitude faster than many existing methods. Our experimental analysis also reveals how noise/error in copy number inference, as expected for real datasets, can drastically impact the accuracies of most methods. We apply DICE-star, the most accurate method on error-prone datasets, to several real single-cell breast and ovarian cancer datasets and find that it rapidly produces phylogenies of equivalent or greater reliability compared with existing methods.
Free, publicly-accessible full text available December 12, 2025
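To make the idea of a distance-based method concrete, here is a minimal sketch of the first step such a method needs: a pairwise distance matrix over single-cell copy number profiles, which a tree-building algorithm could then consume. The abstract does not say which distance measure DICE-bar or DICE-star uses, so the Manhattan distance, the function names, and the toy profiles below are all assumptions made for illustration only.

```python
from itertools import combinations


def manhattan_distance(profile_a, profile_b):
    """Sum of absolute copy number differences across genomic bins."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same bins")
    return sum(abs(a - b) for a, b in zip(profile_a, profile_b))


def pairwise_distance_matrix(profiles):
    """Return sorted cell names and a symmetric cell-by-cell distance matrix."""
    names = sorted(profiles)
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0] * len(names) for _ in names]
    for a, b in combinations(names, 2):
        d = manhattan_distance(profiles[a], profiles[b])
        matrix[index[a]][index[b]] = d
        matrix[index[b]][index[a]] = d
    return names, matrix


# Hypothetical toy profiles: integer copy numbers over six genomic bins.
toy_profiles = {
    "cell_1": [2, 2, 3, 3, 2, 2],
    "cell_2": [2, 2, 3, 3, 1, 1],
    "cell_3": [2, 2, 2, 2, 2, 2],
}
names, matrix = pairwise_distance_matrix(toy_profiles)
for name, row in zip(names, matrix):
    print(name, row)
```

Computing the matrix is quadratic in the number of cells, and the tree is then built from the matrix rather than from the raw profiles, which is one general reason distance-based approaches can scale to large datasets.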
-
Free, publicly-accessible full text available November 22, 2025
-
Abstract
Motivation: Advances in whole-genome single-cell DNA sequencing (scDNA-seq) have led to the development of numerous methods for detecting copy number aberrations (CNAs), a key driver of genetic heterogeneity in cancer. While most of these methods are limited to the inference of total copy number, some recent approaches now infer allele-specific CNAs using innovative techniques for estimating allele-frequencies in low coverage scDNA-seq data. However, these existing allele-specific methods are limited in their segmentation strategies, a crucial step in the CNA detection pipeline.
Results: We present SEACON (Single-cell Estimation of Allele-specific COpy Numbers), an allele-specific copy number profiler for scDNA-seq data. SEACON uses a Gaussian Mixture Model to identify latent copy number states and breakpoints between contiguous segments across cells, filters the segments for high-quality breakpoints using an ensemble technique, and adopts several strategies for tolerating noisy read-depth and allele frequency measurements. Using a wide array of both real and simulated datasets, we show that SEACON derives accurate copy numbers and surpasses existing approaches under numerous experimental conditions, and identify its strengths and weaknesses.
Availability and implementation: SEACON is implemented in Python and is freely available open-source from https://github.com/NabaviLab/SEACON and https://doi.org/10.5281/zenodo.12727008.
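As a rough illustration of how a Gaussian mixture can expose latent states and breakpoints, the sketch below fits a one-dimensional mixture to normalized read depths by expectation-maximization, labels each bin with its most likely component, and reports the positions where the label changes. This is not SEACON's implementation: the abstract does not give the model details, SEACON also uses allele frequencies, and every name, starting value, and data point here is a hypothetical choice.

```python
import math


def fit_gmm_1d(values, init_means, n_iter=50):
    """Fit a 1-D Gaussian mixture by EM, starting from the given means."""
    k = len(init_means)
    means = list(init_means)
    variances = [0.05] * k  # hypothetical starting variance
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens) or 1e-300  # guard against division by zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / len(values)
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(var, 1e-4)  # floor keeps the density finite
    return weights, means, variances


def assign_states(values, weights, means, variances):
    """Label each value with the index of its most likely component."""
    states = []
    for x in values:
        scores = [
            w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        ]
        states.append(scores.index(max(scores)))
    return states


# Hypothetical normalized read depths for ten consecutive bins in one cell.
depths = [0.98, 1.02, 1.01, 0.97, 1.49, 1.52, 1.51, 1.47, 1.00, 1.03]
weights, means, variances = fit_gmm_1d(depths, init_means=[1.0, 1.5])
states = assign_states(depths, weights, means, variances)
# A breakpoint is any bin whose state differs from the previous bin's state.
breakpoints = [i for i in range(1, len(states)) if states[i] != states[i - 1]]
print(states, breakpoints)
```

In this toy input the two depth levels are well separated, so the labels split cleanly; with noisier data the breakpoints would be less certain, which is presumably why the abstract describes an additional filtering step.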
-
Abstract
Summary: CNAsim is a software package for improved simulation of single-cell copy number alteration (CNA) data from tumors. CNAsim can be used to efficiently generate single-cell copy number profiles for thousands of simulated tumor cells under a more realistic error model and a broader range of possible CNA mechanisms compared with existing simulators. The error model implemented in CNAsim accounts for the specific biases of single-cell sequencing that lead to read count fluctuation and poor resolution of CNA detection. For improved realism over existing simulators, CNAsim can (i) generate WGD, whole-chromosomal CNAs, and chromosome-arm CNAs, (ii) simulate subclonal population structure defined by the accumulation of chromosomal CNAs, and (iii) dilute the sampled cell population with both normal diploid cells and pseudo-diploid cells. The software can also generate DNA-seq data for sampled cells.
Availability and implementation: CNAsim is written in Python and is freely available open-source from https://github.com/samsonweiner/CNAsim.
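The following toy sketch illustrates two of the ideas named in the summary: whole-chromosome CNAs defining a tumor clone, and dilution of the sampled population with normal diploid cells. It does not use or reproduce CNAsim's interface or models. The function, its parameters, and the single-clone design are assumptions made purely for illustration, and it omits WGD, chromosome-arm events, pseudo-diploid cells, and the error model.

```python
import random


def simulate_profiles(n_cells, n_chromosomes, normal_fraction, seed=0):
    """Simulate chromosome-level copy numbers for a mixed cell population."""
    rng = random.Random(seed)  # seeded so repeated runs give the same cells
    diploid = [2] * n_chromosomes
    # One hypothetical tumor clone: start diploid, then gain or lose
    # one copy of a quarter of the chromosomes (at least one).
    clone = list(diploid)
    n_events = max(1, n_chromosomes // 4)
    for chrom in rng.sample(range(n_chromosomes), k=n_events):
        clone[chrom] += rng.choice([-1, 1])
    cells = []
    for _ in range(n_cells):
        # Dilute the sample: each cell is normal with the given probability.
        is_normal = rng.random() < normal_fraction
        profile = list(diploid) if is_normal else list(clone)
        cells.append({"normal": is_normal, "profile": profile})
    return clone, cells


clone, cells = simulate_profiles(
    n_cells=100, n_chromosomes=8, normal_fraction=0.3, seed=1
)
n_normal = sum(cell["normal"] for cell in cells)
print("clone profile:", clone)
print("normal cells:", n_normal, "tumor cells:", len(cells) - n_normal)
```

A realistic simulator would evolve many subclones along a tree and add measurement noise on top of the true profiles; the sketch only shows the shape of the output, one copy number vector per sampled cell.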
